Domain Specific Hierarchical Huffman Encoding

Authors

  • K. Ilambharathi
  • G. S. N. V. Venkata Manik
  • N. Sadagopan
  • B. Sivaselvan
Abstract

In this paper, we revisit the classical data compression problem for domain-specific texts. It is well known that the classical Huffman algorithm is optimal among prefix codes, and that it compresses at the character level. Since many data transfers are domain specific (for example, downloads of lecture notes, web blogs, etc.), it is natural to consider compression at a coarser granularity, i.e., at the word level rather than the character level. Our framework employs a two-level compression scheme. The first level identifies frequent patterns in the text using classical frequent pattern algorithms. Each identified pattern is replaced with a special string, and to achieve a better compression ratio the special string is guaranteed to be shorter than the pattern it replaces. The classical Huffman algorithm is then applied to the resulting text. In short, the first level compresses at the word level and the second at the character level. For domain-specific texts, this two-level technique outperforms the classical Huffman technique. To support this claim, we present both theoretical and simulation results for domain-specific texts.
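The two-level idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: plain word counting stands in for the frequent pattern mining step, the special strings are single characters taken from the Unicode Private Use Area (assumed not to occur in the input), and the cost of storing the replacement table is ignored. The function names `replace_frequent_words`, `restore_words`, and `huffman_bits` are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a classical Huffman code."""
    freq = Counter(text)
    if len(freq) <= 1:
        # Degenerate inputs: empty text, or a single distinct symbol (1 bit each).
        return {s: 1 for s in freq}
    # Heap entries are (weight, unique tiebreak, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def huffman_bits(text):
    """Total number of bits needed to Huffman-encode text (code table excluded)."""
    freq = Counter(text)
    lengths = huffman_code_lengths(text)
    return sum(freq[s] * lengths[s] for s in freq)


def replace_frequent_words(text, min_count=2):
    """Level 1 (assumed stand-in): swap repeated words for 1-character codes."""
    words = text.split(" ")
    counts = Counter(words)
    # Only words longer than one character can shrink when replaced.
    frequent = [w for w, c in counts.most_common() if c >= min_count and len(w) > 1]
    table = {w: chr(0xE000 + i) for i, w in enumerate(frequent)}
    transformed = " ".join(table.get(w, w) for w in words)
    return transformed, table


def restore_words(transformed, table):
    """Invert replace_frequent_words using the same table."""
    inverse = {code: word for word, code in table.items()}
    return " ".join(inverse.get(w, w) for w in transformed.split(" "))


if __name__ == "__main__":
    sample = "data compression of domain data uses compression of domain text"
    level1, table = replace_frequent_words(sample)
    print("Huffman only (bits):", huffman_bits(sample))
    print("Two-level (bits):   ", huffman_bits(level1))
```

Whether the two-level total is actually smaller depends on the text and on how the replacement table is stored; the sketch only makes the pipeline concrete.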

Similar resources

Compression of Time-Varying Information Using Huffman Code

Abstract: In this paper, we fit a function to the probability density curve representing an information stream, using an artificial neural network. The result of this methodology is a specific function that represents a memorizable probability density curve. We then use the resulting function for information compression with the Huffman algorithm. The difference between the proposed method and the general me...

SPIHT Algorithm with Huffman Encoding for Image Compression and Quality Improvement over MIMO OFDM Channel

In this paper, compression and quality improvement of images during transmission, using the SPIHT algorithm combined with Huffman encoding over an OFDM channel, is proposed. The image is first decomposed into different levels; the compressed coefficients are then arranged in descending order of priority and mapped over the channels. The coefficients with lower importance level, which are likely t...

A novel technique for image steganography based on Block-DCT and Huffman Encoding

Image steganography is the art of hiding information in a cover image. This paper presents a novel technique for image steganography based on Block-DCT, where the DCT is used to transform blocks of the original image (cover image) from the spatial domain to the frequency domain. First, a gray-level image of size M × N is divided into disjoint 8 × 8 blocks and a two-dimensional Discrete Cosine Transform (2-d DCT...

A Novel Technique for Image Steganography Based on DWT and Huffman Encoding

Image steganography is the art of hiding information in a cover image. This paper presents a novel technique for image steganography based on DWT, where the DWT is used to transform the original image (cover image) from the spatial domain to the frequency domain. First, a two-dimensional Discrete Wavelet Transform (2-D DWT) is performed on a gray-level cover image of size M × N and Huffman encoding is perform...

Compression Combined Robust Watermarking Scheme using SVD Replacement Technique

This paper proposes a novel compression combined digital image watermarking scheme based on singular value replacement technique. Image compression is achieved using Huffman encoding technique. Huffman encoding is an entropy encoding algorithm offering lossless image compression. The proposed watermarking scheme combines Integer wavelet transform (IWT) with singular value decomposition (SVD). F...

Journal:
  • CoRR

Volume: abs/1307.0920

Publication date: 2013